Resilient Decentralized Consensus
Consensus is one of the fundamental principles that we can build on when designing distributed systems: Whenever a group of processes establishes consensus, all processes agree on the current state of the system. This allows building distributed systems in which a group of processes acts as one coherent state machine, also referred to as state-machine replication.
Since consensus algorithms (such as Paxos or Raft) can become rather complex, most applications rely on external tools such as databases or dedicated replication managers to achieve consensus. This limits application flexibility and makes it hard to optimize the consensus algorithm towards a specific application scenario.
Goal
When picked as a seminar, the goal of this topic is to get an overview of different consensus algorithms and compare them, especially with regards to their system assumptions. Which algorithm works best in which setting, e.g., data centers versus peer-to-peer applications?
When picked as a project, the goal of this topic is to implement several algorithms and develop a library of distributed consensus algorithms which can be embedded directly into a distributed program. This allows programmers to select application-specific algorithms which are fine-tuned for a specific use case.
The implementation of the algorithms should follow the principle of immutable replicated state (as seen in append-only log stores or CRDTs) to simplify reasoning and decrease complexity. Furthermore, it should focus on modularity and abstractions, such that several basic building blocks can be combined in multiple ways to achieve different algorithms.
Starting Points
- Paxos Made Simple, a general introduction to the consensus problem and the Paxos consenus algorithm
- A Generalised Solution to Distributed Consensus, an overview, how we can model consensus with immutable replicated state
- Knowledge and common knowledge in a distributed environment, a theoretical discussion of the limits and principles of consensus/agreement
- Chemistry behind Agreement, an overview of the building blocks of modern consensus algorithms
- Distributed Locking as a Data Type, an example of how we can use CRDTs to implement consensus algorithms
Example Tool
- etcd A distributed key-value store and an implementation of the Raft consensus algorithm
- Apache Zookeeper A coordination service, using an atomic-broadcast protocol